Why You May Not Want to Correct “Bias” in an AI Model

FAccT (Fairness, Accountability, and Transparency) is an ACM-sponsored, cross-disciplinary effort to study “fairness” in socio-technical systems.

A 2021 paper, “On the Dangers of Stochastic Parrots: Can Language Models Be Too Big? 🦜” (Bender et al. 2021), argues that the drive toward ever-larger training sets can be bad for “fairness” because the data will inevitably include “derogatory” language or ideas that are no longer part of acceptable discourse.

Yoav Goldberg published a critique (@yoavgo) with several counter-arguments, including:

  1. The paper attacks the wrong target

The real criticism is not specific to model size; it applies to any language model. Framing it as a question of size is harmful.

Any model, no matter how big or small, will reflect the input you give it. If you have a problem with the results of a model, look not at the size but at the data itself.
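
To make that point concrete: even a trivial model reflects whatever skew is in its data. The sketch below is my own illustration, not from the paper or the critique; the toy corpus and function names are hypothetical. It counts gendered pronouns around two profession words and reproduces exactly the imbalance contained in the three input sentences.

```python
from collections import Counter

# Hypothetical toy corpus: the skew below is a property of the data,
# and any model trained on it, however small, will reflect that skew.
corpus = [
    "the doctor said he would review the chart",
    "the doctor said he was running late",
    "the nurse said she would update the chart",
]

def pronoun_counts(sentences, word):
    """Count gendered pronouns in sentences that mention `word`."""
    counts = Counter()
    for sentence in sentences:
        tokens = sentence.lower().split()
        if word in tokens:
            counts.update(t for t in tokens if t in {"he", "she"})
    return counts

print(pronoun_counts(corpus, "doctor"))  # Counter({'he': 2}) -- mirrors the corpus
print(pronoun_counts(corpus, "nurse"))   # Counter({'she': 1})
```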

  2. The paper takes a one-sided political position, without presenting it as such and without presenting the alternative views

It’s obvious that the people writing and endorsing this paper hold specific, left-wing political views that they infuse into their work. You can argue whether their bias is better or worse than other kinds of bias, but we shouldn’t grant them the objectivity they claim.

References

Bender, Emily M., Timnit Gebru, Angelina McMillan-Major, and Shmargaret Shmitchell. 2021. “On the Dangers of Stochastic Parrots: Can Language Models Be Too Big? 🦜.” In Proceedings of the 2021 ACM Conference on Fairness, Accountability, and Transparency, 610–23. Virtual Event Canada: ACM. https://doi.org/10.1145/3442188.3445922.